Method of Controlling an Image Capturing System, Image Capturing System and Digital Camera

ABSTRACT

A method of controlling an image capturing system comprising an interface for receiving an external trigger to capture an image, and an image capturing device provided with a photosensitive area and an array of pixel cells, each pixel cell including a device for generating a signal indicative of the intensity of light falling on an associated part of the photosensitive area, which image capturing device is further provided with readout circuitry for generating an array of pixel values to capture an image frame at a set spatial resolution, such that each pixel value represents an integral of the signal or signals generated in at least one of the pixel cells in an associated one of a number of areas over an exposure time interval, the number of areas being determined by the set spatial resolution, the areas together covering a region of the photosensitive area corresponding to a region in the image, comprises receiving an external trigger to capture an image, and, in response to the external trigger, directing the image capturing device to capture at least two image frames by generating respective arrays of pixel values representing integrals over respective consecutive exposure time intervals. The spatial resolutions of at least two of the captured image frames are set to different values.

CROSS-REFERENCE TO RELATED APPLICATION

The present application is a Section 371 National Stage Application of and claims priority of International patent application Serial No. PCT/EP2005/052121, filed May 10, 2005, and published as WO 2006/119802 in English.

BACKGROUND OF THE INVENTION

The invention relates to a method of controlling an image capturing system comprising an interface for receiving an external trigger to capture an image, and an image capturing device provided with a photosensitive area and an array of pixel cells, each pixel cell including a device for generating a signal indicative of the intensity of light falling on an associated part of the photosensitive area, which image capturing device is further provided with readout circuitry for generating an array of pixel values to capture an image frame at a set spatial resolution, such that each pixel value represents an integral of the signal or signals generated in at least one of the pixel cells in an associated one of a number of areas over an exposure time interval, the number of areas being determined by the set spatial resolution, the areas together covering a region of the photosensitive area corresponding to a region in the image.

The invention also relates to an image capturing system.

The invention also relates to a method of forming a combined final image from a plurality of image frames, including the steps of: obtaining a first and at least one further array of intensity values, each array of intensity values encoding light intensity levels at each of a respective number of pixel positions in the respective image frame, the number determining the spatial resolution of the image frame concerned, generating a set of derived arrays of intensity values, each derived array being based on a respective one of the obtained arrays of intensity levels and encoding light intensity levels at each of a common number of pixel positions in at least a region of overlap of the respective image frames, generating an array of combined intensity values, each element in the array based on a sum of intensity values represented by the corresponding element in each of the respective derived arrays of intensity values, and providing an array of intensity values encoding the combined final image, the array being based on the array of combined intensity values.

The invention also relates to an image processing system.

The invention also relates to a digital camera.

The invention also relates to a computer program.

The aforementioned application describes a digital camera. The camera can be used in a substantially stationary position to capture a sequence of images and to derive a sequence of corresponding frames of pixel values representing the images. Each image is underexposed on purpose. The images are adjusted prior to forming them into a combined final image. The combined final image is formed by summing the values of corresponding pixels in the adjusted images. The combined final image may therefore be formed from underexposed images, but is itself sufficiently bright, as well as having good spatial resolution. The adjustment is used to prevent the combined final image from being blurred.

A problem associated with capturing a series of underexposed image frames for later combination is due to the types of image capturing devices available for use. Generally, these either have pixel cells comprising Charge Coupled Devices (CCDs) or are made with Complementary Metal Oxide Semiconductor (CMOS) sensors, in both cases with associated read-out circuitry. In particular when CCD arrays are used, the readout time, i.e. the time needed by the read-out circuitry to generate the array of pixel values encoding a frame, is very long. The time needed to capture a series of consecutive image frames for subsequent formation of a combined image is thus even longer. Setting the image spatial resolution to a lower value results in a lower spatial resolution combined image if interpolation techniques are used to increase the spatial resolution of the captured image frames. Reducing the number of captured image frames on which to base the combined final image would achieve a lower total image capture time, but at the expense of a decreased signal-to-noise ratio (SNR) of the combined final image.

SUMMARY OF THE INVENTION

This Summary and Abstract are provided to introduce some concepts in a simplified form that are further described below in the Detailed Description. This Summary and Abstract are not intended to identify key features or essential features of the claimed subject matter, nor are they intended to be used as an aid in determining the scope of the claimed subject matter. In addition, the description herein provided and the claimed subject matter should not be interpreted as being directed to addressing any of the short-comings discussed in the Background.

Aspects of the invention include methods and systems of the type defined above that result in image frames for formation into a combined final image with a relatively low noise level whilst requiring a relatively short overall image capture time.

In one embodiment of capturing an image, a method includes receiving an external trigger to capture the image, and, in response to the external trigger, directing the image capturing device to capture at least two image frames by generating respective arrays of pixel values representing integrals over respective consecutive exposure time intervals, wherein the spatial resolutions of at least two of the captured image frames are set to different values.

Because at least two of the captured image frames have different spatial resolutions, there is always at least one image frame with a higher and at least one with a lower resolution. Capturing an image frame with a higher resolution ensures that information with a high spatial frequency is present in the combined final image. Because not all image frames have the same, higher, resolution, the total time needed to capture and read out all the image data is relatively low, however. Because the external trigger, e.g. a user command, results directly in the capture of at least two image frames, the image frames in the series follow each other as closely as possible, saving additional time. The change in settings to set a different resolution is also accomplished automatically in response to the external trigger. Because the image frames are captured separately in such a manner that they may be combined into a combined final image by summation, the combined final image may be composed of image frames with relatively short exposure times, leading to a combined final image with little blur.

In a further embodiment, at least the lower of the spatial resolution values is set by directing the image capturing device to generate an array of pixel values in such a manner that each pixel value is representative of the integral of the sum of the signals generated by at least two devices in pixel cells.

Such a technique is commonly referred to as ‘binning’, and has the effect of increasing sensitivity, because the two or more devices in pixel cells effectively occupy a larger part of the photosensitive area. Furthermore, the captured image frame has a lower noise level.

In yet a further embodiment, the method includes retrieving a desired exposure time for a combined final image, determining the number of image frames to be captured, and for each image frame, calculating settings determining an exposure level applicable to the image frame, the settings including the length of the exposure time interval, wherein the settings are calculated so that the sum of the lengths of the exposure time intervals over the number of image frames is equal to or less than the desired exposure time.

Each of the captured images is underexposed when viewed alone. The combined final image is not, however, because it is based on the combined total of image frames. This embodiment has the advantage that it enables addition of intensity levels representative of pixel values in the various image frames to generate one combined final image with a correct exposure.

An embodiment includes the step of generating a set of arrays of pixel values, each based on one of the captured image frames, in such a manner that each encodes at least a region of an adjusted frame at the same spatial resolution.

This embodiment increases the suitability of the captured image frames for generating a combined final image by summing corresponding pixel values in at least the regions of the adjusted frames.

According to another aspect of the invention, there is provided an image capturing system comprising an interface for receiving an external trigger to capture an image, an image capturing device provided with a photosensitive area and an array of pixel cells, each pixel cell including a device for generating a signal indicative of the intensity of light falling on an associated part of the photosensitive area, which image capturing device is further provided with readout circuitry for generating an array of pixel values to capture an image frame at a set spatial resolution, such that each pixel value represents an integral of the signal or signals generated in at least one of the pixel cells in an associated one of a number of areas over an exposure time interval, the number of areas being determined by the set spatial resolution, the areas together covering a region of the photosensitive area corresponding to a region in the image, which image capturing system comprises a control system for controlling the operation of the image capturing device and for processing commands received through the interface, wherein the control system is configured to, in response to the external trigger, direct the image capturing device to capture at least two image frames by generating respective arrays of pixel values representing integrals over respective consecutive exposure time intervals, wherein the control system is further configured to set the spatial resolutions of at least two of the captured image frames to different values.

In an embodiment, the image capturing system according to an aspect of the invention is configured to execute a method of capturing an image according to an aspect of the invention.

According to another aspect of the invention, the method of forming a combined final image from a plurality of image frames includes the steps of: obtaining a first and at least one further array of intensity values, each array of intensity values encoding light intensity levels at each of a respective number of pixel positions in the respective image frame, the number determining the spatial resolution of the image frame concerned, generating a set of derived arrays of intensity values, each derived array being based on a respective one of the obtained arrays of intensity levels and encoding light intensity levels at each of a common number of pixel positions in at least a region of overlap of the respective image frames, generating an array of combined intensity values, each element in the array based on a sum of intensity values represented by the corresponding element in each of the respective derived arrays of intensity values, and providing an array of intensity values encoding the combined final image, the array being based on the array of combined intensity values, wherein a first array of intensity values encoding at least the region of overlap at a higher resolution than the further arrays of intensity values is obtained, an array of intensity values encoding at least the region of overlap in the combined final image at a higher spatial resolution than the further arrays of intensity values is provided, and the array of intensity values encoding the combined final image is based on a sufficient number of intensity values in the first array of intensity values to encode the region of overlap at a higher resolution than the further arrays of intensity values.

The method has the advantage of resulting in a combined final image with a relatively high resolution without requiring a large number of image frames of the same resolution. Because each element in the array of combined intensity values is based on a sum of intensity values represented by the corresponding element in each of the respective derived arrays of intensity values, the step of generating this array of combined intensity values removes noise. Because the array of intensity values is based on the array of combined intensity values, at least partially, the beneficial effect extends to the combined final image. Therefore, the combined final image has at once a relatively high spatial resolution and low noise level.

A first embodiment of the method includes obtaining first and further arrays of intensity values in which each intensity value represents a light level in an area surrounding a pixel position, wherein at least one derived array of intensity values is obtained by adjusting the number of intensity values in an array by a multiplication factor, such that each derived array encodes at least the region of overlap at the same spatial resolution.

This embodiment has the effect of enabling the step of generating an array of combined intensity values to be performed by straightforward summation in the space-domain.

In a variant of this embodiment, the number of intensity values in at least one array based on an obtained further array of intensity values is adjusted by a multiplication factor larger than one.

Thus, at least one low-resolution image frame is converted to a higher resolution. This is an effective way of ensuring that the array of intensity values encoding the combined final image is based on a sufficient number of intensity values in the first array of intensity values, since a sub-set, or in one embodiment all, of the intensity values in the first obtained array can simply be added to their counterparts in the arrays obtained by adjustment to obtain a weighted average. The array of combined intensity values also encodes the final image.
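
By way of illustration only, the space-domain variant can be sketched as follows. The sketch assumes that low-resolution frames are expanded by simple pixel replication and that frames are held as lists of rows; neither assumption is taken from the present description, and the names `upscale` and `combine_space_domain` are hypothetical.

```python
def upscale(frame, factor):
    """Expand a 2-D frame by replicating each value factor x factor times.

    Pixel replication is an assumed, simplest-possible way of adjusting
    the number of intensity values by a multiplication factor larger than one.
    """
    out = []
    for row in frame:
        wide = [value for value in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def combine_space_domain(high_res, low_res_frames, factor):
    """Sum the high-resolution frame with the expanded low-resolution frames."""
    derived = [high_res] + [upscale(f, factor) for f in low_res_frames]
    rows, cols = len(high_res), len(high_res[0])
    for d in derived:
        if len(d) != rows or any(len(r) != cols for r in d):
            raise ValueError("derived arrays must share one spatial resolution")
    # Element-wise sum over all derived arrays.
    return [[sum(d[r][c] for d in derived) for c in range(cols)]
            for r in range(rows)]
```

Dividing each summed value by the number of frames, or by chosen weights, would yield the weighted average mentioned above; that normalisation is left out of the sketch.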

In a second embodiment, each derived array of intensity values is generated by transforming an image frame encoded by an array of intensity values based on one of the obtained arrays of intensity values and in which each intensity value represents a light level in an area surrounding a pixel position in an image frame, into the spatial frequency domain, such that each intensity value in a derived array of intensity values represents an intensity of a spatial frequency component of the image frame.

This embodiment has the advantage that it is not necessary to expand image frames with a low spatial resolution in order to be able to carry out the step of generating an array of combined intensity values. In particular, interpolation is avoided. Instead, each derived array of intensity values includes low-frequency components of the image frames that are derivable from each of the obtained image frames. Relatively few additions are thus required to generate the array of intensity values encoding at least the region of overlap in the combined final image.

In a variant, the step of providing the array of intensity values encoding the combined final image includes replacing at least one intensity value representing a low spatial frequency component in the derived array of intensity values based on the first obtained array of intensity values by an intensity value based at least partly on the intensity value representing the corresponding spatial frequency component in the array of combined intensity values.

Each replacement value may be based on the value it replaces, in order to prevent the occurrence of a ringing effect. Irrespective of this, this variant is a particularly efficient way to arrive at a combined final image based on a sufficient number of intensity values in the first array of intensity values to encode the region of overlap at a higher resolution than the further arrays of intensity values. It suffices to transform the derived array based on the first array back to the space-domain subsequent to replacing the intensity values representing low spatial frequency components. In the thus obtained combined final image, the high-frequency information is derived from the first array, whereas the low-frequency information is a combination of the low-frequency information in the first and further arrays.
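
A one-dimensional sketch of this frequency-domain variant is given below, purely for illustration. It assumes an orthonormal Discrete Cosine Transform, assumes that each low-resolution value is the average of neighbouring high-resolution samples (which motivates the `scale` factor), and assumes that each replacement value is the plain mean of the corresponding components; the description does not prescribe any of these choices, and all function names are hypothetical.

```python
import math


def dct(x):
    """Orthonormal DCT-II of a 1-D sequence (naive O(n^2) form)."""
    n = len(x)
    out = []
    for u in range(n):
        s = sum(x[i] * math.cos(math.pi * (2 * i + 1) * u / (2 * n))
                for i in range(n))
        out.append(s * math.sqrt((1 if u == 0 else 2) / n))
    return out


def idct(coeffs):
    """Inverse of dct() above (orthonormal DCT-III)."""
    n = len(coeffs)
    out = []
    for i in range(n):
        s = coeffs[0] * math.sqrt(1 / n)
        s += sum(coeffs[u] * math.sqrt(2 / n)
                 * math.cos(math.pi * (2 * i + 1) * u / (2 * n))
                 for u in range(1, n))
        out.append(s)
    return out


def combine_frequency_domain(high_res, low_res_frames):
    """Replace the low-frequency components of the high-resolution frame."""
    if not low_res_frames:
        return list(high_res)
    coeffs = dct(high_res)
    n = len(high_res)
    spectra = []
    for frame in low_res_frames:
        # Assumed scaling: aligns amplitudes of averaged low-resolution data.
        scale = math.sqrt(n / len(frame))
        spectra.append([scale * c for c in dct(frame)])
    shared = min(len(s) for s in spectra)  # components present in every frame
    for u in range(shared):
        total = coeffs[u] + sum(s[u] for s in spectra)
        # The replacement depends partly on the value it replaces.
        coeffs[u] = total / (1 + len(spectra))
    return idct(coeffs)
```

Only the first `shared` components are touched, so the high-frequency content of the result comes from the high-resolution frame alone. The scaling is exact for the lowest component and only approximate for the others.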

In a variant, the transformation is carried out by a co-processor comprising at least a partial implementation in hardware of an image compression algorithm or by a digital signal processor programmed to implement an image compression algorithm.

This variant is particularly suited to implementation in a digital camera or other type of image processing equipment, which commonly comprise such a co-processor. Since many compression algorithms involve the use of a form of entropy coding for which transformation into the spatial frequency domain is required, this variant is very efficient.

In an embodiment, the step of generating an array of combined intensity values is preceded by a step of aligning the image frames, such that each derived array encodes light intensity levels at each of substantially corresponding pixel positions in at least the region of overlap.

This ensures that the combined final image is relatively sharp, since ‘fuzziness’ due to misalignment of the image frames encoded by the obtained arrays of intensity values is avoided. Such misalignment is apt to occur where the arrays of intensity values are obtained by means of a digital camera taking pictures of a scene in succession. Of course, ‘fuzziness’ due to trembling of objects or persons in the scene is also removed.

In an embodiment, at least one array of intensity values, based on an obtained array of intensity values encoding at least the region of overlap in the respective image frame at a higher spatial resolution than at least one further array of intensity values, is subjected to a digital filter operation having a characteristic of passing high spatial frequency components of the image encoded by the array.

Because the higher resolution image also has a higher noise level, but is only really needed to provide image information with a high spatial frequency, the noise level of the combined final image at lower frequencies is thus reduced. This is advantageous because the human eye is most sensitive at relatively low spatial frequencies.

According to another aspect, the invention provides an image processing system for forming a combined final image from a plurality of image frames, which image processing system includes an arrangement for loading a first and at least one further array of intensity values, each array of intensity values encoding light intensity levels at each of a respective number of pixel positions in the respective image frame, the number determining the spatial resolution of the image frame concerned, and a data processing arrangement for processing the intensity values, wherein the system is configured to direct the data processing arrangement to perform the steps of generating a set of derived arrays of intensity values, each derived array being based on a respective one of the obtained arrays of intensity levels and encoding light intensity levels at each of a common number of pixel positions in at least a region of overlap of the respective image frames, generating an array of combined intensity values, each element in the array based on a sum of intensity values represented by the corresponding element in each of the respective derived arrays of intensity values, and providing an array of intensity values encoding the combined final image, the array being based on the array of combined intensity values, wherein the system is configured to load a first array of intensity values encoding at least the region of overlap at a higher resolution than the further arrays of intensity values, to provide an array of intensity values encoding at least the region of overlap in the combined final image at a higher spatial resolution than the further arrays of intensity values, and to base the array of intensity values encoding the combined final image on a sufficient number of intensity values in the first array of intensity values to encode the region of overlap at a higher resolution than the further arrays of intensity values.

In one embodiment, the image processing system is configured to direct the processor to execute a method of forming a combined final image according to an aspect of the invention.

According to another aspect, the invention provides a computer program configured, when loaded into a programmable processing device, to enable the programmable processing device to carry out a method according to an aspect of the invention.

According to another aspect, the invention provides a digital camera comprising an image capturing system and/or an image processing system according to an aspect of the invention.

BRIEF DESCRIPTION OF THE DRAWINGS

The invention will now be explained in further detail with reference to the accompanying drawings, in which:

FIG. 1 shows schematically the layout of an exemplary digital camera;

FIG. 2 shows in very schematic fashion some components of an image capturing device in the camera;

FIG. 3 is a flow diagram illustrating a method of capturing image frames and forming a combined final image;

FIGS. 4A-4C show in very schematic fashion arrays of intensity values illustrating how the combined final image is formed in one embodiment; and

FIG. 5 is an illustration of noise levels of the image frames relative to the sensitivity of the human eye.

DETAILED DESCRIPTION OF THE ILLUSTRATIVE EMBODIMENTS

One example of an image processing system usable in the context of the methods outlined herein is a digital camera 1. Other examples include a photocopier or scanning device.

The digital camera 1 comprises a lens system 2 for focusing on one or more objects in a scene. When a shutter 3 is opened, the scene is projected through an aperture 4 onto a photosensitive area 5 (FIG. 2) of an image-capturing device 6. The shutter time is controllable, as is the diameter of the aperture. As an alternative, or addition, to the shutter 3, the image capturing device could be electronically controlled to provide the same effect (electronic shutter). The image-capturing device 6 can be a device implemented in Complementary Metal-Oxide Semiconductor (CMOS) technology, or a Charge-Coupled Device (CCD) sensor. The methods outlined herein are ideally suited to CCD sensors, which have the advantage of being cheap to manufacture, but have inherently long read-out times.

Referring to FIG. 2, the photosensitive area 5 is divided into areas occupied by pixel cells 7 a-i, of which only nine are shown for clarity. Each pixel cell 7 includes a device for generating a signal indicative of the intensity of light to which the area that it occupies within the photosensitive area 5 is exposed. The device is, as stated, in one embodiment, a CCD sensor. It is noted that the devices occupying the pixel cells 7 a-i are generally provided as components of one integrated circuit. An integral of the signal generated by a device is formed during exposure, for example by accumulation of photocurrent in a capacitor. Subsequent to exposure of the photo-sensitive area 5 for the duration of an exposure time interval, the values of the integrals of the generated signals are read out by means of row selection circuit 8 and column selection and readout circuit 9.

It is noted that, for simplicity, this description will not focus on the way in which colour images are captured. It is merely observed that any known type of technology can be used, such as colour filters, a colour-sensitive variant of the image capturing device 6, etc. In this respect, it is also observed that the photosensitive area 5 need not be the surface area of an integrated circuit comprised in an image-capturing device, or at least not for all colour components. Furthermore, although in the present application, image frames will be said to be captured consecutively, this does not preclude embodiments, wherein image frames of different colour components are captured in order, so that ‘consecutively’ captured image frames detailing one colour component are alternated by those detailing other colour components.

The output of the column select and read-out circuit 9 is provided in the form of one or more analog signals to an Analog-to-Digital converter (A/D-converter) 10. The A/D-converter 10 samples and quantises the signals received from the image capturing device 6, i.e. records them on a scale with discrete levels, the number of which is determined by the number of bits of resolution of the digital words provided as output by the A/D converter 10. The A/D converter 10 provides as output an array of pixel values encoding a captured image frame.
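
The quantisation step can be illustrated with the following minimal sketch. It assumes a uniform scale from zero to a full-scale value with clipping outside that range; the function name `quantise` and its parameters are hypothetical and do not describe the actual A/D converter 10.

```python
def quantise(value, full_scale, bits):
    """Map an analog value onto one of 2**bits discrete levels.

    Assumes a uniform scale from 0 to full_scale; values outside
    that range are clipped.
    """
    levels = 2 ** bits
    clipped = min(max(value, 0.0), full_scale)
    # The top of the range maps to the highest level, levels - 1.
    return min(int(clipped / full_scale * levels), levels - 1)
```

With 8 bits of resolution, for example, the scale has 256 levels, numbered 0 to 255.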

A Digital Signal Processor (DSP) 11 performs such functions as interpolation between pixels and optionally compression of the image. Each exposure of the image-capturing device during an exposure time interval results in at least one frame.

The digital camera 1 comprises a storage device 12 for storing the image data encoding the captured images or image frames. The storage device can be any usual type of storage device, e.g. built-in flash memory, inserted flash memory modules, a disk drive with a floppy disk, a PCMCIA-format hard disk, or an optical disk drive.

A microprocessor 13 controls the operation of the digital camera 1, by executing instructions stored in non-volatile memory, in this example a Read-Only Memory (ROM) 14. The instructions in ROM 14, in some embodiments in combination with routines programmed for execution by DSP 11, enable the digital camera 1 to execute the image processing and capturing methods outlined in the present application.

Advantageously, the microprocessor 13 communicates with a co-processor 15 in which at least part of an image compression algorithm is implemented in hardware. Algorithms to compress images in accordance with the JPEG-standard are usable, for example. As part of the compression algorithm, the image data is transformed into the spatial frequency domain. The co-processor 15 executes at least this transformation, using a Discrete Cosine Transform (DCT) in most cases.

Indications of the operating conditions and settings of the digital camera 1 are provided on an output device 16, for example a Liquid Crystal Display, possibly in combination with a sound-producing device (not illustrated separately).

An input device 17 is shown schematically as being representative of the controls by means of which the user of the digital camera provides commands. In addition, the digital camera 1 illustrated in FIG. 1 comprises a flash driver circuit 18 for providing appropriate driving signals to one or more sources of flash lighting. The illustrated digital camera 1 also comprises a motion sensor 19, for providing a signal representative of the movement of the digital camera 1, and thus of the image-capturing device 6. Furthermore, the digital camera 1 comprises an exposure metering device 20. The purpose of the exposure metering device 20 is to measure the strength of the ambient light, so that the microprocessor 13 can determine the intensity of light to be emitted by any connected flash, in combination with the correct values for the settings determining the exposure, which include the exposure time interval for each captured image frame, as will be elaborated on below.

It will be noted that the density of the areas occupied by the pixel cells 7 a-i determines the maximum attainable spatial resolution of a captured image frame. The readout time depends on the number of pixel cells. It can be relatively long in embodiments such as the one illustrated in FIG. 2, because each row is selected in turn using row selection circuit 8, whereupon the column selection and readout circuit 9 senses the values of the accumulated photocharge stored in the photodevices in the pixel cells in that row. To reduce the total time involved in repeatedly exposing the photosensitive area and capturing an image frame, the spatial resolution is set to a different value between exposures.

The microprocessor 13 defines a number of cluster areas 21, which together cover a region corresponding to a region of interest in the combined final image. The number is smaller than the number of pixel cells 7 a-i that together cover the region. Thus, a cluster of pixel cells occupies each defined cluster area 21, as is schematically illustrated in FIG. 2. For the sake of clarity, not all pixel cells 7 are shown. To capture an image frame at a lower spatial resolution, one pixel value per cluster area 21 is read out. To capture an image frame at the highest possible spatial resolution, one pixel value per pixel cell 7 is read out. Incidentally, although the cluster areas 21 as illustrated have been defined such as to partition the photosensitive area 5, the microprocessor 13 may alternatively or additionally define a number of overlapping areas which together cover a region of the photosensitive area 5 corresponding to the region of interest. Alternatively, areas may be defined with a slight spacing between them. To avoid having to carry out compensatory processing, the defined areas each can surround regularly distributed pixel positions.

In one embodiment, the microprocessor 13 controls the image-capturing device 6 in such a manner that the one pixel value read out per cluster area 21 represents an integral of the signal generated in one of the pixel cells 7 that lie within the cluster area 21. This embodiment has the virtue that it can be used with any type of image-capturing device 6.
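
A software analogue of reading out one pixel value per cluster area is sketched below for illustration. It assumes square, non-overlapping cluster areas and a frame held as a list of rows; the function name `subsample` and the offset parameters are hypothetical.

```python
def subsample(frame, factor, row_offset=0, col_offset=0):
    """Keep one value per factor x factor cluster area.

    The offsets select which pixel cell inside each cluster is read,
    e.g. (1, 1) picks the centre cell of a 3 x 3 cluster.
    """
    return [row[col_offset::factor] for row in frame[row_offset::factor]]
```

Applied to the nine pixel cells 7 a-i with a factor of 3, the sketch returns a single value, namely that of the selected cell.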

In one embodiment, the image-capturing device 6 has the capability to “bin” the outputs of multiple pixel cells. In this embodiment, the microprocessor 13 directs the image-capturing device 6 to generate an array of pixel values (each value being associated with one of the defined cluster areas 21) in such a manner that each pixel value is representative of the integral of the sum of the signals generated by at least two devices in pixel cells that occupy the same defined cluster area 21. In the embodiment shown, this could mean that the pixel value for one cluster area 21 is the sum, or alternatively the average, of the integrals of the signal generated by all nine of the shown pixel cells 7 a-7 i. This embodiment is advantageous, because it increases the sensitivity. Effectively, each pixel value represents the amount of light that fell on the whole of a defined cluster area 21, instead of just on the area occupied by one pixel cell 7. Thus, smaller light fluxes are detectable. Furthermore, binning decreases the amount of noise, i.e. leads to a low resolution image with a higher Signal-to-Noise-Ratio (SNR). As the binning capability is a function of the image-capturing device that is implemented in hardware, it does not add appreciably to the read out time. In one embodiment, the number of image frames that are captured at the highest resolution is equal to, but in some cases lower than, the number of image frames captured at lower spatial resolutions. A combined final image formed on the basis of such a series of image frames will have a good SNR.
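
The arithmetic of binning can be illustrated in software as follows. In the embodiment described, binning is performed in hardware by the image-capturing device 6; the sketch merely shows the sum or average per cluster area, assuming square clusters that partition the frame. The function name `bin_frame` is hypothetical.

```python
def bin_frame(frame, factor, average=False):
    """Sum (or average) each factor x factor cluster of pixel integrals."""
    rows, cols = len(frame), len(frame[0])
    if rows % factor or cols % factor:
        raise ValueError("frame dimensions must be multiples of the factor")
    binned = []
    for r in range(0, rows, factor):
        out_row = []
        for c in range(0, cols, factor):
            # Accumulate every pixel cell inside this cluster area.
            total = sum(frame[r + dr][c + dc]
                        for dr in range(factor) for dc in range(factor))
            out_row.append(total / (factor * factor) if average else total)
        binned.append(out_row)
    return binned
```

For the nine pixel cells 7 a-i, a factor of 3 yields one pixel value for the single cluster area 21.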

In yet a further embodiment, upon receiving a command from a user to capture an image, the microprocessor 13 controls the digital camera 1 to carry out a series of steps 22-25. In one example, the command is instead received from a device (not shown) connected to the digital camera 1 through a suitable interface. This device issues an external trigger to start the execution of the illustrated steps. A user of the digital camera 1 may input a desired exposure time for a combined final image, together with settings determining the amount of flash light, the diameter of aperture 4 and the sensitivity of the photodevices in the pixel cells 7. In alternative embodiments, the microprocessor determines one or more of these values automatically, using a signal output by the exposure metering device 20, and possibly one or more pre-defined combinations of values. Subsequently, the microprocessor 13, upon receiving a command actually to capture the combined final image, executes a first step 22 of capturing a number of image frames. This step 22 comprises retrieving the desired exposure time for the combined final image, determining the number of image frames to be captured and, for each image frame, calculating exposure settings determining an exposure level applicable to the image frame. The settings include the exposure time interval for the frame. In some cases, the other settings are determined such as to result in exposure time intervals for the image frames that, together, are shorter than the desired exposure time for the combined final image. It is noted that the embodiment in which “binning” is carried out allows a reduction in the exposure time interval applicable to the image frames, because binning increases the sensitivity. Effectively, “binning” results in the introduction of an extra amplification of the photo-electric signal. The microprocessor 13 advantageously takes account of this. It calculates the length of the exposure time interval applicable to the image frame at a lower spatial resolution value in dependence on the spatial resolution value, i.e. the amount of “binning”.
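By way of illustration only, the dependence of the exposure time interval on the amount of binning might be sketched as follows. The sketch assumes that binning sums the signals of all pixel cells in a cluster area and that the response is linear, so that the extra amplification equals the number of cells per cluster; neither the formula nor the function name `binned_exposure_time` is taken from the description.

```python
def binned_exposure_time(full_res_time, cells_per_cluster):
    """Exposure time giving a binned frame the same pixel-value level
    that full_res_time gives an unbinned frame.

    Assumption: summing binning with a linear response, so the signal
    per pixel value scales with the number of cells per cluster.
    """
    if cells_per_cluster < 1:
        raise ValueError("cells_per_cluster must be at least 1")
    return full_res_time / cells_per_cluster
```

Under these assumptions, a frame binned over 3×3 clusters would need one ninth of the exposure time interval of a full-resolution frame for the same recorded level.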

When calculating the settings determining the exposure levels applicable to the image frames, the microprocessor 13 preferably implements one or more of the methods outlined in international patent application PCT/EP04/051080. That is, they are calculated such that the total exposure level that is determined as desirable for the combined final image is unevenly distributed over the image frames. The passages in that application relating to the stepping of exposure levels are hereby incorporated by reference, and recapitulated briefly.

As mentioned above, the exposure level is determined by the exposure time, aperture, (flash) lighting intensity, and the amplifier gain in a pixel cell. It is further determined by the A/D conversion threshold of the A/D converter 10. Stepping the amplification used to amplify an output of the photodevice in each pixel cell 7 has the advantage of easy implementation. In alternative embodiments, the exposure time for image frames of the same resolution is varied. In other embodiments, the maximum intensity of light admitted onto the photosensitive area is varied per image frame, for example by adjusting the size of the aperture 4, or the intensity of the flash controlled through the flash driver circuit 18.

In a first embodiment, the size of the aperture 4, as well as the lighting conditions, are kept constant between exposures. The desired exposure time for the combined final image is unevenly distributed over the image frames. In one embodiment, the number of image frames is selected to keep the exposure time interval for each image frame below a certain threshold level. For instance, this threshold level is pre-determined at 1/60 second, as this is considered the slowest shutter speed at which the average photographer can capture a steady image.
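The selection of the number of image frames and the uneven distribution of the exposure time might be sketched as follows. The actual stepping methods are those of PCT/EP04/051080 and are not reproduced here; the linear stepping used below (exposure times in the ratio 1 : 2 : ... : n), the treatment of the threshold as an upper limit that may just be reached, and the function name `plan_exposure_times` are assumptions for illustration only.

```python
from fractions import Fraction


def plan_exposure_times(total_time, max_frame_time=Fraction(1, 60)):
    """Split total_time into n unequal exposure time intervals.

    Assumed scheme: interval i is proportional to i (i = 1..n), and n is
    the smallest count for which the longest interval does not exceed
    max_frame_time. Exact fractions avoid rounding surprises.
    """
    total = Fraction(total_time)
    limit = Fraction(max_frame_time)
    if total <= 0 or limit <= 0:
        raise ValueError("times must be positive")
    n = 1
    # The longest interval is total * n / (n * (n + 1) / 2) = 2 * total / (n + 1).
    while 2 * total / (n + 1) > limit:
        n += 1
    weight_sum = n * (n + 1) // 2
    return [total * i / weight_sum for i in range(1, n + 1)]
```

For a desired exposure time of 1/10 second and the 1/60 second threshold, this hypothetical scheme yields eleven frames whose exposure time intervals add up to exactly 1/10 second.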

In one variant, the exposure time is varied randomly between frames. In another embodiment, settings of the image capturing system, in this case the exposure time, are adjusted before several further captures of a frame in such a manner that at least a maximum of the scale on which intensity values for each pixel are recorded changes substantially uniformly in value with each adjustment. This has the advantage of resulting in a more accurate capture of the colour and tonal depth in the combined final image. In each case, where binning is used to adjust the spatial resolution between captured frames as well, the impact on the exposure level is taken into account.

In an alternative embodiment, the size of the aperture 4 is adjusted between two successive captures of an image frame in such a manner that at least a maximum of the scale on which intensity values for each pixel are recorded changes substantially uniformly in value with each adjustment. The exposure level is stepped down in equal increments by adjusting the aperture area. If no binning is applied, then the aperture area is stepped down in equal increments. Otherwise it is scaled with the multiplication factor resulting from basing each pixel value on the signals from multiple pixel cells.

In yet another embodiment, the intensity of artificial light used to illuminate a scene to which the image-capturing device 6 is exposed is decreased in steps. Where the resolution decreases simultaneously, the intensity of artificial light is decreased by increasing amounts.

Embodiments combining one or more of the techniques described in the preceding paragraphs are also conceivable.

Following the first step 22 in which the image frames are captured, the arrays of pixel values encoding the image frames are cached in a second step 23. Following the second step 23, they are aligned and processed in a third step 24. The combined final image resulting from the third step is stored in storage device 12 in a final step 25. Although the present description will now continue on the assumption that the digital camera 1 carries out all of the steps 22-25, the third and fourth steps 24, 25 could be carried out in a separate image processing system, for example a personal computer or workstation. In that case, the second step would involve committing the generated arrays of pixel values to storage in the storage device 12 or transferring them to the computer via a data link (not shown).

Two embodiments of a method of forming a combined final image as performed in the course of executing steps 23 and 24 are described below by way of example. They have in common that arrays of intensity values are obtained as input. The arrays of intensity values encode light intensity values at each of a respective number of pixel positions in the respective image frame, the number determining the spatial resolution of the image frame concerned. A set of derived arrays of intensity values is generated, each derived array being based on a respective one of the obtained arrays of intensity levels and encoding light intensity levels at each of a common number of pixel positions in at least a region of overlap of the respective image frames. In a first embodiment, the derived arrays encode the image frame in the space domain; in the second embodiment, the derived arrays encode the image frame in the spatial frequency domain. It is observed that the term “derived array” is not intended to signify that an array of values is stored as a data construct in memory. It is sufficient that corresponding elements of a notional array are available for summation at a certain point in time. Thus, each i^(th) element of each derived array should be available concurrently. This allows for the generation of an array of combined intensity values, in which each i^(th) element is based on a sum (which may be a weighted sum) of the i^(th) elements of the derived arrays. The values of the latter represent intensity values in the space or spatial frequency domain, as the case may be. It is further observed that the derived arrays may correspond fully to the obtained arrays. This would be the case if the obtained arrays are already in the right domain for summation, and encode respective image frames that are already aligned, for instance. Both embodiments further have in common that an array of intensity values encoding the combined final image is provided as output, this array being based on the array of combined intensity values obtained by summation. In some embodiments, it is actually identical to the array of combined intensity values.

In a first embodiment, as part of either the second step 23 or the third step 24, the set of captured image frames is converted into a set of adjusted image frames encoded by a corresponding set of arrays of pixel values. In this first embodiment, each pixel value in an array of intensity values encoding a captured image frame represents a light level in an area 21 surrounding one of a number of pixel positions. The number of pixel positions is proportional to the spatial resolution of the image frame, because the sizes of the respective image frames are the same. Each array of intensity values derived subsequently encodes an adjusted image frame based on one of the captured image frames. Each is generated in such a manner that each encodes at least a region of an adjusted frame at a desired resolution that is the same for each of the derived arrays. The region may correspond to the entire image frame, incidentally. Each array encoding an adjusted image frame is generated in such a manner that corresponding pixel values encoding the region in the arrays represent respective light levels in an area surrounding substantially the same pixel position. That is to say, the i^(th) pixel value of the pixel values of each array that encode the same region of interest corresponds to the same pixel position in each array for all values of i corresponding to a pixel position in the region of interest.

Because the spatial resolution differs between captured image frames, the spatial resolution of at least one of them must be adjusted by a multiplication factor, at least in the region of interest. Otherwise, it would not be possible to achieve the characteristic that each array of pixel values encoding an adjusted frame encodes at least the region of interest at the same spatial resolution. Preferably, the resolution of the lower-resolution frames is increased. This results in a combined final image with the highest possible perceived spatial resolution when the pixel values encoding the region are summed to form the combined final image. Any known technique to increase the spatial resolution may be applied, for instance interpolation.

Then, the derived arrays of pixel values, encoding the image frames adjusted in resolution, are used to generate an array of combined pixel values. Each element in this array is the sum of the corresponding elements of the derived arrays. In one example, the sum is a weighted sum. For example, the weights may be inversely related to the exposure times of the image frames. In another example, each combined pixel value is an average of the corresponding pixel values. The array thus formed is provided as output.
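The first embodiment can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the implementation of the described camera: pixel replication stands in for "any known technique" of increasing the resolution, the frames are assumed to be aligned already, and the names `upscale` and `combine_frames` are hypothetical.

```python
def upscale(frame, factor):
    """Increase the resolution by an integer multiplication factor using
    pixel replication (the simplest form of interpolation)."""
    out = []
    for row in frame:
        wide = [value for value in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def combine_frames(frames, weights=None):
    """Weighted sum of equally sized, already aligned frames.

    With weights=None every frame has weight 1, giving a plain sum;
    weights of 1/len(frames) give an average.
    """
    if weights is None:
        weights = [1.0] * len(frames)
    rows, cols = len(frames[0]), len(frames[0][0])
    return [[sum(w * f[r][c] for f, w in zip(frames, weights))
             for c in range(cols)]
            for r in range(rows)]
```

A low-resolution frame is first brought to the common resolution with `upscale` and then passed, together with the high-resolution frame, to `combine_frames`; the weights could, for example, be chosen inversely related to the exposure times as mentioned above.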

A second embodiment of the method of forming a combined final image is illustrated in FIGS. 4A-4C. A first array 26 of pixel values encodes light intensity levels at each of a respective number of pixel positions in a first image frame. Each intensity value represents a light level in an area surrounding a pixel position. The same is true for a second array 27 of pixel values, encoding a second image frame. The first and second image frames represent the same captured scene. It will be assumed herein that the image frames encoded by the first and second arrays 26, 27 have previously been aligned. There are known methods for aligning images to sub-pixel resolution, for example using sample points.

The first array 26 of pixel values is divided into four blocks 28-31. The second array 27 of pixel values is divided into the same number of blocks 32-35. A first block 28 in the first array 26 corresponds to a first block 32 in the second array 27, i.e. represents a substantially overlapping section of the respective image frame. In the same manner, a second block 29 in the first array 26 corresponds to a second block 33 in the second array, a third block 30 corresponds to a third block 34 and a fourth block 31 to a fourth block 35 in the second array 27. Each of the blocks 32-35 in the second array 27 will be proportionally smaller in terms of the number of pixel values comprised therein than the corresponding one of the blocks 28-31 in the first array 26. In the example illustrated in FIG. 4A, the low-resolution image frame is represented by the second array 27 comprising blocks of 2×2 pixel values, whereas the high-resolution image frame is represented by the first array 26, having 8×8 pixel values per block 28-31. Only the pixel values in the first blocks 28, 32 are shown in FIG. 4A.

A discrete cosine transform (DCT) into the spatial frequency domain is performed on a block-by-block basis. A first array 36 of DCT coefficients (FIG. 4B) is obtained by performing the DCT on the first array 26 representing the first image frame in the space domain. A second array 37 of DCT coefficients is obtained by performing a DCT on the second array 27. The first array 36 and second array 37 of DCT coefficients encode light intensity levels at each of a respective number of pixel positions in the respective image frame, only then in the spatial frequency domain. The number of DCT coefficients determines the spatial resolution.

In a next step, four DCT coefficients 38a-38d representing the lowest frequency components of the intensity distribution are derived from the first array 36 of DCT coefficients. Four DCT coefficients 39a-39d representing the components at the same frequency in the second array 37 of DCT coefficients are derived from that array. As the image frames have previously been aligned, the derived arrays represent light intensity levels at each of a common number of pixel positions in the first and second image frames. In the presented example, each derived array comprises four elements.

In a next step, an array 40 of DCT coefficients encoding a combined image (in the spatial frequency domain) is generated. The array 40 is also divided into four blocks 41-44. A first block 41 is based on the first blocks 28, 32, a second block 42 on the second blocks 29, 33, a third block 43 on the third blocks 30, 34, and a fourth block 44 is based on the values in the fourth blocks 31, 35. Four DCT coefficients 45a-d represent the lowest frequency components of the section of the combined final image encoded by the first block 41. They are each based on a sum of intensity values represented by the values 38a-d, 39a-d of the corresponding elements in the arrays derived from the first and second arrays 36, 37 of DCT coefficients. This could be done via an addition or averaging process. The remaining DCT coefficients in the first block 41 are based solely on those in the corresponding block of DCT coefficients in the first array 36 of DCT coefficients. They are thus based indirectly on the pixel values in the first block 28 of the first array 26 of pixel values. Thus, the combined final image is encoded at a higher resolution than the image frame represented by the second array 27 of pixel values. In the illustrated example the spatial resolution of the combined final image corresponds to that of the first image frame. In alternative embodiments, it has a value in between that of the first and second image frames.
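The block-wise combination in the spatial frequency domain might be sketched as follows for one pair of corresponding blocks. Several points are assumptions rather than part of the description: an orthonormal DCT-II is used, the low-resolution pixel values are taken to be cluster averages at the same intensity level as the high-resolution ones, the low-resolution coefficients are therefore rescaled by the ratio of the block sizes before averaging, and the function names `dct2` and `combine_dct_blocks` are hypothetical.

```python
import math


def dct2(block):
    """Orthonormal two-dimensional DCT-II of a square block."""
    n = len(block)

    def alpha(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    basis = [[alpha(k) * math.cos((2 * i + 1) * k * math.pi / (2 * n))
              for i in range(n)] for k in range(n)]
    # Transform the columns first, then the rows: basis * block * basis^T.
    tmp = [[sum(basis[u][i] * block[i][j] for i in range(n))
            for j in range(n)] for u in range(n)]
    return [[sum(tmp[u][j] * basis[v][j] for j in range(n))
             for v in range(n)] for u in range(n)]


def combine_dct_blocks(high_coeffs, low_coeffs):
    """Average the lowest-frequency coefficients of two corresponding blocks.

    Coefficients outside the low-resolution range are copied unchanged
    from the high-resolution block.
    """
    m = len(low_coeffs)
    scale = len(high_coeffs) / m  # assumed normalisation between block sizes
    combined = [row[:] for row in high_coeffs]
    for u in range(m):
        for v in range(m):
            combined[u][v] = (high_coeffs[u][v] + scale * low_coeffs[u][v]) / 2
    return combined
```

For the 8×8 and 2×2 blocks of the example, only four coefficients per block are touched by the averaging; no upscaling of the low-resolution array is needed.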

The embodiment illustrated in FIGS. 4A-4C has a number of features that make implementation in the digital camera 1 attractive. There is no need to scale up the second array 27 of pixel values to the same number of pixel values as the first array 26. The interpolation that is thus avoided is particularly processing intensive, requiring a relatively powerful microprocessor 13. The number of additions required to carry out summation of intensity values is also relatively limited, as only the derived arrays of DCT coefficients 38a-d, 39a-d are added. Nevertheless, the combined final image is encoded at a resolution that is higher than that at which the low-resolution images on which it is based are encoded. This is the case because it is based on a sufficient number of the pixel values in the first array 26.

In an advantageous implementation, the transformation into the spatial frequency domain is carried out by using the co-processor 15. It will be recalled that the co-processor 15 comprises an implementation in hardware of an image compression algorithm, for example to generate JPEG-compressed images. The microprocessor 13 is thus spared from having to compute the DCT coefficients in the first and second arrays 36, 37.

In one variant, the co-processor 15 converts the first and second arrays 26, 27 to the JPEG format, whereupon the array of combined intensity values is generated. In another embodiment, the co-processor 15 returns the DCT coefficients, with the microprocessor 13 carrying out the remaining steps in the method. In one implementation, the DCT coefficients are obtained by passing a null coefficient table for the entropy coding that is normally part of the image compression algorithm.

To achieve the property that each derived array encoding an adjusted image frame is generated in such a manner that corresponding intensity values encoding the region in the arrays represent light levels in an area surrounding the same pixel position, alignment using one or more of the methods outlined in PCT/EP04/051080 is advantageously applied. This applies equally to both embodiments of the combination method presented above. Relevant passages of that document are herein incorporated by reference. The step of alignment precedes the summation step in both embodiments illustrated herein. In the second embodiment, it generally precedes also the transformation into the spatial frequency domain. Without the alignment, the arrays of intensity values encoding the region in the adjusted image frames could still be said to have the property that they encode light levels in areas surrounding substantially the same one of a number of pixel positions, only the degree of correspondence in pixel position is slightly less due to the effect of camera shake.

Following adjustment to align the image frames and provide them with the same spatial resolution, the combined final image is formed. This is done by forming an array of pixel values encoding the region in a combined final image, such that each pixel value in the formed array is the sum of the corresponding pixel values in the arrays of pixel values encoding the region in the adjusted image frames.

It has been outlined above that the captured image frames with higher resolution have a higher noise level, and that binning reduces the noise level. This is visible in FIG. 5, which also illustrates the advantageous effects of a noise shaping technique that can be used. The dashed line surrounding a left-most area 46 delimits the boundaries of the range of frequency information contained in the binned, lower-resolution image, as well as indicating the noise level. The dashed and dotted line surrounding a right-most area 47 does the same for a higher-resolution image to which a high-pass digital filter has been applied. The digital high-pass filter may be applied prior to adjustment of the spatial resolution and/or alignment, or subsequent thereto. Without the application of the high-pass filter, the right-most area 47 would extend to the lower frequencies, at the same noise level. A continuous curve 48 representing the sensitivity of the eye of a human (or animal for that matter), demonstrates that the noise level of the higher-resolution image frame at lower frequencies would have been perceptible. The noise shaping achieved by means of capturing separate low-resolution image frames and high-resolution image frames and by subjecting the latter to a high-pass filter results in a combined final image with an acceptable noise level at all spatial frequencies.
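The noise shaping idea can be illustrated with a deliberately crude high-pass filter. The description does not specify the filter design; the sketch below assumes a filter that subtracts from each high-resolution pixel value the mean of its k×k cluster, so that only detail above the low-resolution frequency range remains, and then adds that detail to the replicated low-resolution frame. The frames are assumed to be aligned and at the same intensity level, and all function names are hypothetical.

```python
def cluster_means(frame, k):
    """Mean of each k x k cluster of a frame (a crude low-pass filter)."""
    return [[sum(frame[r + dr][c + dc]
                 for dr in range(k) for dc in range(k)) / (k * k)
             for c in range(0, len(frame[0]), k)]
            for r in range(0, len(frame), k)]


def replicate(frame, k):
    """Repeat every value k times horizontally and vertically."""
    return [[value for value in row for _ in range(k)]
            for row in frame for _ in range(k)]


def high_pass(frame, k):
    """Remove the cluster mean from every pixel value (assumed filter)."""
    low = replicate(cluster_means(frame, k), k)
    return [[frame[r][c] - low[r][c] for c in range(len(frame[0]))]
            for r in range(len(frame))]


def noise_shaped_combination(high_res, low_res, k):
    """Low frequencies from the binned frame, detail from the filtered
    high-resolution frame."""
    detail = high_pass(high_res, k)
    base = replicate(low_res, k)
    return [[base[r][c] + detail[r][c] for c in range(len(base[0]))]
            for r in range(len(base))]
```

In this sketch the low-frequency content of the result, and hence its low-frequency noise, comes entirely from the binned frame, while the high-resolution frame contributes only the detail that the binned frame cannot encode.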

The invention is not limited to the embodiments described above, which may be varied within the scope of the attached claims. The number of different levels of spatial resolution employed to capture the image frames for one combined final image can be two or higher. High-pass filtering and summation of pixel values may be carried out in an image processing system external to the digital camera 1. Alternatively, all steps prior to the actual summation of pixel values to form the array of pixel values encoding the combined final image may be carried out in the digital camera. Such adjusted image frames are then stored in the digital camera 1 for subsequent transfer to a computer or other image processing system. Furthermore, instead of underexposing each captured image, the gain of an amplifier between the output of the image capturing device (CCD or CMOS) and the A/D converter can be set very high. This results in an image with visible noise. The exposure is “correct”, but the image has a lower quality than would be the case with a slower exposure. The methods outlined above improve the image quality in such an embodiment.

1. Method of controlling an image capturing system comprising an interface for receiving an external trigger to capture an image, and an image capturing device provided with a photosensitive area and an array of pixel cells, each pixel cell including a device for generating a signal indicative of the intensity of light falling on an associated part of the photosensitive area, which image capturing device is further provided with readout circuitry for generating an array of pixel values to capture an image frame at a set spatial resolution, such that each pixel value represents an integral of the signal or signals generated in at least one of the pixel cells in an associated one of a number of areas over an exposure time interval, the number of areas being determined by the set spatial resolution, the areas together covering a region of the photosensitive area corresponding to a region in the image, which method comprises receiving an external trigger to capture an image, and, in response to the external trigger, directing the image capturing device to capture at least two image frames by generating respective arrays of pixel values representing integrals over respective consecutive exposure time intervals, wherein the spatial resolutions of at least two of the captured image frames are set to different values.
2. Method according to claim 1, wherein at least the lower of the spatial resolution values is set by directing the image capturing device to generate an array of pixel values in such a manner that each pixel value is representative of the integral of the sum of the signals generated by at least two devices in pixel cells.
3. Method according to claim 1, including retrieving a desired exposure time for a combined final image, determining the number of image frames to be captured, for each image frame, calculating settings determining an exposure level applicable to the image frame, the settings including the length of the exposure time interval, wherein the settings are calculated so that the sum of the lengths of the exposure time intervals over the number of image frames is equal to or less than the desired exposure time.
4. Method according to claim 2, wherein at least the length of the exposure time interval applicable to the image frame at the lower of the spatial resolution values is calculated in dependence of the spatial resolution value.
5. Method according to claim 1, including the step of generating a set of arrays of pixel values, each based on one of the captured image frames, in such a manner that each encodes at least a region of an adjusted frame at the same spatial resolution.
6. Image capturing system comprising an interface for receiving an external trigger to capture an image, an image capturing device provided with a photosensitive area and an array of pixel cells, each pixel cell including a device for generating a signal indicative of the intensity of light falling on an associated part of the photosensitive area, which image capturing device is further provided with readout circuitry for generating an array of pixel values to capture an image frame at a set spatial resolution, such that each pixel value represents an integral of the signal or signals generated in at least one of the pixel cells in an associated one of a number of areas over an exposure time interval, the number of areas being determined by the set spatial resolution, the areas together covering a region of the photosensitive area corresponding to a region in the image, which image capturing system comprises a control system for controlling the operation of the image capturing device and for processing commands received through the interface, wherein the control system is configured to, in response to the external trigger, direct the image capturing device to capture at least two image frames by generating respective arrays of pixel values representing integrals over respective consecutive exposure time intervals, wherein the control system is further configured to set the spatial resolutions of at least two of the captured image frames to different values.
7. Image capturing system according to claim 6, wherein the control system is configured to execute a method according to claim 1.
8. Method of forming a combined final image from a plurality of image frames, including the steps of: obtaining a first and at least one further array of intensity values, each array of intensity values encoding light intensity levels at each of a respective number of pixel positions in the respective image frame, the number determining the spatial resolution of the image frame concerned, generating a set of derived arrays of intensity values, each derived array being based on a respective one of the obtained arrays of intensity levels and encoding light intensity levels at each of a common number of pixel positions in at least a region of overlap of the respective image frames, generating an array of combined intensity values, each element in the array based on a sum of intensity values represented by the corresponding element in each of the respective derived arrays of intensity values, and providing an array of intensity values encoding the combined final image, the array being based on the array of combined intensity values, wherein a first array of intensity values encoding at least the region of overlap at a higher resolution than the further arrays of intensity values is obtained, an array of intensity values encoding at least the region of overlap in the combined final image at a higher spatial resolution than the further arrays of intensity values is provided, and the array of intensity values encoding the combined final image is based on a sufficient number of intensity values in the first array of intensity values to encode the region of overlap at a higher resolution than the further arrays of intensity values.
9. Method according to claim 8, including obtaining first and further arrays of intensity values in which each intensity value represents a light level in an area surrounding a pixel position, wherein at least one derived array of intensity values is obtained by adjusting the number of intensity values in an array by a multiplication factor, such that each derived array encodes at least the region of overlap at the same spatial resolution.
10. Method according to claim 9, wherein the number of intensity values in at least one array based on an obtained further array of intensity values is adjusted by a multiplication factor larger than one.
11. Method according to claim 8, wherein each derived array of intensity values is generated by transforming an image frame encoded by an array of intensity values based on one of the obtained arrays of intensity values and in which each intensity value represents a light level in an area surrounding a pixel position in an image frame, into the spatial frequency domain, such that each intensity value in a derived array of intensity values represents an intensity of a spatial frequency component of the image frame.
12. Method according to claim 11, wherein the step of providing the array of intensity values encoding the combined final image includes replacing at least one intensity value representing a low spatial frequency component in the derived array of intensity values based on the first obtained array of intensity values by an intensity value based at least partly on the intensity value representing the corresponding spatial frequency component in the array of combined intensity values.
13. Method according to claim 11, wherein the transformation is carried out by a co-processor comprising at least a partial implementation in hardware of an image compression algorithm, or by a digital signal processor programmed to implement an image compression algorithm.
14. Method according to claim 8, wherein the step of generating an array of combined intensity values is preceded by a step of aligning the image frames, such that each derived array encodes light intensity levels at each of substantially corresponding pixel positions in at least the region of overlap.
15. Method according to claim 8, wherein at least one array of intensity values, based on an obtained array of intensity values encoding at least the region of overlap in the respective image frame at a higher spatial resolution than at least one further array of intensity values, is subjected to a digital filter operation having a characteristic of passing high spatial frequency components of the image encoded by the array.
16. Image processing system for forming a combined final image from a plurality of image frames, which image processing system includes an arrangement for loading a first and at least one further array of intensity values, each array of intensity values encoding light intensity levels at each of a respective number of pixel positions in the respective image frame, the number determining the spatial resolution of the image frame concerned, and a data processing arrangement for processing the intensity values, wherein the system is configured to direct the data processing arrangement to perform the steps of generating a set of derived arrays of intensity values, each derived array being based on a respective one of the obtained arrays of intensity levels and encoding light intensity levels at each of a common number of pixel positions in at least a region of overlap of the respective image frames, generating an array of combined intensity values, each element in the array based on a sum of intensity values represented by the corresponding element in each of the respective derived arrays of intensity values, and providing an array of intensity values encoding the combined final image, the array being based on the array of combined intensity values, wherein the system is configured to load a first array of intensity values encoding at least the region of overlap at a higher resolution than the further arrays of intensity values, to provide an array of intensity values encoding at least the region of overlap in the combined final image at a higher spatial resolution than the further arrays of intensity values, and to base the array of intensity values encoding the combined final image on a sufficient number of intensity values in the first array of intensity values to encode the region of overlap at a higher resolution than the further arrays of intensity values.
17. Image processing system according to claim 16, configured to direct the processor to execute a method according to claim 8.
18. Computer program configured, when loaded into a programmable processing device, to enable the programmable processing device to carry out a method according to claim 1.
19. Digital camera comprising an image capturing system and/or an image processing system according to claim 6.